I. Introduction

During the preceding contract year, a variety of subtasks
have been performed, mostly in two areas: 1) systems analysis
and 2) algorithmic development. The major effort in the
systems analysis task (see Section II) was the development of a
recommended approach to the monitoring of resource utilization
data for the Large Area Crop Inventory Experiment (LACIE).
Other efforts included participation in various studies concerning
the LACIE Project Plan, the utility of the GE Image 100, and the
specifications for a special purpose processor to be used in the
LACIE. In the second task (see Section III), the major effort was
the development of improved algorithms for estimating proportions
of unclassified remotely sensed data. Also, work was performed
on optimal feature extraction in general and on optimal feature
extraction for proportion estimation.

This report summarizes the findings of these tasks. Details
of some of these tasks are to be found in ICSA technical reports
referenced herein.


II. Task I: Systems Analysis

This study developed a rationale and a method for a system
to collect resource utilization (RU) data for the LACIE. The
method employed for conducting this study was to adopt a "top-
down" approach toward the design of such a system. The first
step was to determine who would be the likely users of such data
and what were the anticipated uses. Next, the types and amounts
of data needed were determined from a detailed inspection of the
proposed system. The last step was to devise a scheme for
obtaining the data, getting it into the system, and providing for
generation of appropriate reports as needed. Details of this study
may be found in "The Resource Utilization Monitoring System for
the Large Area Crop Inventory Experiment - a Recommended Approach,"
by R. A. Schafer, D. L. Van Rooy, and M. S. Lynn, ICSA Report
#275-025-020, and "Data Base Design & Maintenance for the
Resource Utilization Monitoring System for the Large Area Crop
Inventory Experiment - a Recommended Approach," by R. A.
Schafer, ICSA Report #275-025-021.

It was found that, in general, the users of RU data can be
classified into two groups differentiated by their objectives:

     1. To operationally monitor the resource utilization
        of the system in order to spot processing
        bottlenecks, improve data flow, provide audit
        and security facilities, etc.

     2. To examine post hoc the resource utilization of
        the system in order to determine the cost-
        effectiveness of the system and to provide
        appropriate data to aid in the development of
        similar future systems.
Any or all of the above suggested uses of RU data may not
presently, or in the future, be the intention of any group now
connected with the LACIE and are only suggestions as to the
applicability of RU data in the two categories of usage. The
presently known users of RU data in category 1 are the LACIE
subsystem managers, the Earth Observations Division Management
Team and the Project Management Team. The only presently
known user in category 2 is the USDA, although the USDA will also
be doing some operational monitoring. Other users not presently
known would likely be organizations interested in performing LACIE-
type functions on their own computer systems.

Five types of RU data were identified: computer usage,
manpower usage, materials usage, overhead and throughput rate.
Due to operational difficulties in determining a means of quantifying
overhead, that type of RU data was not included in the eventual
system design. The amount of data to be collected, due to the
scope of the LACIE, was considerable; thus a method of organizing
the raw data into a usable form was examined. A project
accounting structure was decided upon as a basis for this organization.
A hierarchy of reporting/accounting levels was established and an
information retrieval system was described which utilized that
structure.

A structured data base was designed with the lowest level
being the LACIE subsystem (or easily separable functions within a
subsystem) and higher levels being the geographic structure of
LACIE, i.e. stratum, zone, region, and country. This data base
was designed to be kept on magnetic tape, and the update procedure
involved copying an old master tape onto a new master tape while
accumulating data from an update tape. This update tape would be
created from RU data supplied to a Resource Data Manager. A
set of forms for this data was described as examples of the data
to be collected. A sample set of reports was also designed on the
same basis. A major design objective of the data base and reporting
system was to provide the flexibility to produce reports on
arbitrary combinations of the RU data, since the current
understanding of future needs was incomplete.
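
The master-tape update and the hierarchical reporting described
above can be illustrated with a minimal sketch. It is not the
design given in the ICSA reports: the record layout, the function
names (update_master, report), and the sample values are all
hypothetical, and Python dictionaries stand in for the tapes.

```python
from collections import defaultdict

# Hypothetical record key: (country, region, zone, stratum, subsystem, ru_type).
# The field names and sample values are illustrative, not taken from the report.
LEVELS = ("country", "region", "zone", "stratum", "subsystem")


def update_master(old_master, update):
    """Return a new master: a copy of the old one with update amounts accumulated."""
    new_master = dict(old_master)  # the old master is left untouched
    for key, amount in update.items():
        new_master[key] = new_master.get(key, 0.0) + amount
    return new_master


def report(master, level, ru_type):
    """Total one RU type at a chosen level of the accounting hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for key, amount in master.items():
        if key[-1] == ru_type:
            totals[key[:depth]] += amount
    return dict(totals)


# Assumed sample data: one existing record and three update records.
old_master = {("US", "R1", "Z1", "S1", "classification", "computer_hours"): 2.0}
update_tape = {
    ("US", "R1", "Z1", "S1", "classification", "computer_hours"): 1.5,
    ("US", "R1", "Z2", "S4", "classification", "computer_hours"): 0.5,
    ("US", "R1", "Z1", "S1", "classification", "man_hours"): 8.0,
}
new_master = update_master(old_master, update_tape)
zone_totals = report(new_master, "zone", "computer_hours")
country_totals = report(new_master, "country", "computer_hours")
```

Keeping every record at the lowest level and summing on demand is
one way to allow reports on arbitrary combinations of the data,
since no reporting level has to be fixed in advance.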

Some programs were then written to gather some of the data
(in particular from the ERIPS DELOG tape). Also, an information
retrieval system available at Rice was used to produce sample
reports using simulated data.

Rice University personnel also participated in several other
systems analysis studies. These included an examination of the
General Electric Image 100 for use by the EOD as an applications
development system; participation in the design review of the
LACIE; and a critique of the specifications for the special purpose
processor to be used in the LACIE.
III. Task II: Algorithmic Development

Most of the effort on this task was devoted to the
development and testing of two algorithms for estimating proportions of
classes contained in multispectral data. The motivation for this
work comes mostly from the LACIE, where the total acreage of a
crop, rather than its exact location, is one of the major quantities
of interest.

The first method (see "Optimal Design of an Unsupervised
Adaptive Classifier with Unknown Priors," by D. Kazakos, ICSA
Report #275-025-013) involves classification of the data while
updating the estimate of the proportions. To test this algorithm,
a version for the special case of two classes was programmed
and pseudo-random data was generated. The algorithm for this
case is:
The P_n's are then bounded in the interval (0,1).

This algorithm was tested on some 2-class, one-dimensional
pseudo-random data. It was found that the data needs to be taken
in scrambled order, and that the algorithm has a tendency to
"stick" to a boundary (P_n ≈ 1 or 0) in its present form.
By bounding L(P_n), this "sticking" could be obviated, but the
variance of the estimates was still considerably larger in some
cases than the asymptotic error variance, even for as many as
1000 points. These experiments led us to the conclusion that the
finite sample bias of this algorithm is probably too large for use
in LACIE, so further development of this algorithm was discontinued.

Subsequently, a new method for proportion estimation was
developed (see "Recursive Estimation of Prior Probabilities Using
the Mixture Approach," by D. Kazakos, ICSA Report #275-025-019).
This algorithm uses a recursive estimate of the prior probabilities
to achieve results comparable to those of maximum likelihood
estimation (the results should be the same for the special case of 2
classes), though the former is much easier to implement and
computationally more efficient. This algorithm was tested using both
M > 2 class and N > 1 dimensional pseudo-random data and Hill
County LANDSAT data (the same data used in "An Empirical
Comparison of Five Proportion Estimators," by W. A. Coberly and
P. L. Odell, Annual Report for NASA contract NAS 9-13512 for the
University of Texas at Dallas). The function L(P_n), which
controls the error variance, was found to be too complex to evaluate
at each step. Therefore, the algorithm was modified so that a
constant L was used, and this constant depended only on the
statistics of the classes. This approach will produce some
degradation in the estimate of the proportions. Other modifications of
the algorithm include the bounding and renormalizing of the current
estimate of the prior probabilities at each step, and the scrambling
of the order of the data so as to prevent blocks of data belonging
to single classes from "confusing" the estimator.
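
The three modifications just described (a constant gain L,
bounding and renormalizing at each step, and scrambling the data)
can be sketched as follows. This report does not reproduce the
update rule of the referenced ICSA report, so the step used below
is an assumed stand-in: a generic stochastic-approximation update
that moves the current estimate toward the posterior class
probabilities of each sample with gain L/n. The function names,
the one-dimensional Gaussian class model, the bound eps, and the
test data are likewise assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def recursive_proportions(data, class_params, L=1.0, eps=1e-3, seed=0):
    """Recursively estimate mixture proportions for known 1-D Gaussian classes.

    class_params is a list of (mean, std) pairs; L is a constant gain factor.
    The update rule is an assumed stand-in, not the one from the ICSA report.
    """
    rng = random.Random(seed)
    samples = list(data)
    rng.shuffle(samples)  # scramble so single-class blocks do not dominate
    m = len(class_params)
    p = [1.0 / m] * m  # start from equal proportions
    for n, x in enumerate(samples, start=1):
        dens = [gaussian_pdf(x, mu, sd) for mu, sd in class_params]
        mix = sum(pi * fi for pi, fi in zip(p, dens))
        if mix <= 0.0:
            continue  # densities underflowed to zero; skip this sample
        post = [pi * fi / mix for pi, fi in zip(p, dens)]
        gain = L / n
        p = [pi + gain * (qi - pi) for pi, qi in zip(p, post)]
        # bound each proportion away from 0 and 1, then renormalize
        p = [min(max(pi, eps), 1.0 - eps) for pi in p]
        total = sum(p)
        p = [pi / total for pi in p]
    return p


# Pseudo-random two-class data supplied in single-class blocks.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(3000)]
data += [data_rng.gauss(4.0, 1.0) for _ in range(7000)]
estimate = recursive_proportions(data, [(0.0, 1.0), (4.0, 1.0)], L=1.0)
print(estimate)  # should land near the generating proportions 0.3 and 0.7
```

With a gain larger than 1/n the first few steps can push an
estimate outside (0,1), which is one reason the bounding and
renormalizing step is needed in a scheme of this kind.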

The results of the testing on Hill County data are given in
Table 1. For this case, the calculated value of L was 11.5.
Other values of L were also used, since it was felt that L
probably should be restricted to L ≤ 10. The maximum likelihood
estimate obtained by Coberly and Odell for this case is also given.
There is some doubt about the true proportions of wheat and barley,
since one "wheat" field is consistently classified as barley; so the
numbers in parentheses refer to the proportions if that field really
is barley. However, the mean-squared error (MSE) is given about
the other proportions. These results indicate that the recursive
estimator can achieve results comparable to the maximum likelihood
one, but the problem of what value to choose for L remains.
Presently, we are investigating other approximations to the function
L which could alleviate this difficulty. Further development and
testing, including a timing comparison with the maximum likelihood
estimator and a further study into the requirements for shuffling
the data, is now underway, and a report will be issued on the
results obtained. We believe that this estimator is the more
promising of the two developed here and therefore recommend that
all the development effort be put into this estimator rather than the
first one.

A related project was some preliminary development of an
algorithm for optimal feature extraction for estimating proportions.
For the special case of two Gaussian classes, an expression was
derived for an upper bound of the error variance when optimally
estimating proportions. This bound is expressed in terms of the
Bhattacharyya distance, and it was shown that maximizing the
Bhattacharyya distance minimizes this bound. Thus, existing
feature extraction algorithms (the University of Houston one) may
be used for this special case. A report on this work is in
preparation. Due to the importance of this effort for LACIE and
other projects, we recommend that the EOD support or perform
further research and development in this area for the more general
case of m classes. This could reduce computation costs and
provide bounds for the error to be used in estimating the total
error in the acreage estimate.
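
For reference, the Bhattacharyya distance between two Gaussian
classes has a standard closed form, and the following minimal
sketch evaluates it after a linear projection to one dimension.
The selection of a projection from a short candidate list is only
an illustrative stand-in and is not the University of Houston
algorithm; the function names and the test classes are
hypothetical. The error variance bound itself is not reproduced
here, since its expression is not given in this report.

```python
import math


def project(mean, cov, w):
    """Mean and variance of a Gaussian class after the projection y = w . x."""
    m = sum(wi * mi for wi, mi in zip(w, mean))
    v = sum(w[i] * cov[i][j] * w[j] for i in range(len(w)) for j in range(len(w)))
    return m, v


def bhattacharyya_1d(m1, v1, m2, v2):
    """Bhattacharyya distance between two 1-D Gaussians given means and variances."""
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    cov_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + cov_term


def best_direction(class1, class2, candidates):
    """Pick the candidate projection with the largest Bhattacharyya distance."""
    def score(w):
        return bhattacharyya_1d(*project(*class1, w), *project(*class2, w))
    return max(candidates, key=score)


# Assumed test classes: unit covariance, means separated along the first axis.
identity = [[1.0, 0.0], [0.0, 1.0]]
class_a = ([0.0, 0.0], identity)
class_b = ([3.0, 0.0], identity)
chosen = best_direction(class_a, class_b, [(0.0, 1.0), (1.0, 0.0)])
```

In this example the classes differ only along the first axis, so
the projection onto that axis gives the larger distance and is
the one selected.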
                      Wheat    Fallow   Barley   Grass   Stubble   MSE

True Proportions      .366     .2S6     .115     .079    .147      --
                     (.302)            (.179)

Maximum Likelihood    .300     .297     .177     .086    .140     .010

Recursive Estimate
   L=11.5             .286     .230     .142     .101    .241     .022
   L=7                .308     .270     .164     .067    .190     .012
   L=3                .313     .267     .182     .061    .177     .010
   L=1                .305     .292     .192     .084    .126     .012


                         Hill County Data
                             Table 1
Further development on another feature extraction algorithm
took place and some initial testing was done during this contract
year. This algorithm minimizes the increased risk of
misclassification (see "Optimal Linear and Nonlinear Risk of Misclassification,"
by R. J. P. de Figueiredo, ICSA Report #275-025-014. This work
has been jointly sponsored with the U. S. Army under contract
DA-31-124-ARO-D-462, the U. S. Air Force under contract
AFOSR-75-2777, and the NSF under grant GK-36375). Progress
has been slower than expected in the testing phase, but presently
the algorithm yields satisfactory results for the special case of a
linear transformation from n dimensions to one. Testing will
continue on another program that treats the more general n-to-k
dimensionality reduction. A report on the results of the first
program is being prepared and will be available shortly. Another
report will be issued following development and testing of the
second program.


